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Abstract 

It has been proposed that neural noise in the cortex arises from chaotic dynamics in the 
balanced state: in this model of cortical dynamics, the excitatory and inhibitory inputs 
to each neuron approximately cancel, and activity is driven by fluctuations of the 
synaptic inputs around their mean. It remains unclear whether neural networks in the 
balanced state can perform tasks that are highly sensitive to noise, such as storage of 
continuous parameters in working memory, while also accounting for the irregular 
behavior of single neurons. Here we show that continuous parameter working memory 
can be maintained in the balanced state, in a neural circuit with a simple network 
architecture. We show analytically that in the limit of an infinite network, the dynamics 
generated by this architecture are characterized by a continuous set of steady balanced 
states, allowing for the indefinite storage of a continuous parameter. In finite networks, 
we show that the chaotic noise drives diffusive motion along the approximate attractor, 
which gradually degrades the stored memory. We analyze the dynamics and show that 
the slow diffusive motion induces slowly decaying temporal cross correlations in the 
activity, which differ substantially from those previously described in the balanced state. 
We calculate the diffusivity, and show that it is inversely proportional to the system size. 
For large enough (but realistic) neural population sizes, and with suitable tuning of the 
network connections, the proposed balanced network can sustain continuous parameter 
values in memory over time scales larger by several orders of magnitude than the single 
neuron time scale. 


Introduction 

The consequences of irregular activity in the brain, and the mechanisms responsible for 
its emergence, are topics of fundamental interest in the study of brain function and 
dynamics. In theoretical models of brain activity, the irregular dynamics observed in 
neuronal activity are often modeled as arising from noisy inputs or from intrinsic noise 
in the dynamics of single neurons. However, theoretical and experimental works have 
suggested that explanations based on sources of noise in intrinsic neural dynamics are 
insufficient to account for the stochastic nature of activity in the cortex [TJj^. An 
alternative proposal is that noise in the cortex arises primarily from chaotic dynamics at 
the network level [3||^. A central result in the field is that simple neural circuits with 
recurrent random connectivity can settle, under a broad range of conditions, into a fixed 
point called the balanced state [^[^-[^ : in this state the mean excitatory drive to each 
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neuron nearly balances the mean inhibitory drive, and neural activity is driven by 
fluctuations in the excitatory and inhibitory inputs. The overall dynamics are chaotic, 
resulting in an apparent stochasticity in the activity of single neurons, which can exist 
even in the absence of any sources of random noise intrinsic to the dynamics of single 
neurons and synapses. 

It remains unclear which computational functions in the brain are compatible with 
the architecture of the balanced network model, since this model assumes random, 
unstructured connectivity in its rudimentary form. The possibility that functional 
circuits in the brain are in a balanced state raises another important question: does the 
apparent stochasticity of single neurons in this state have similar consequences on brain 
function as would arise from stochasticity which is truly intrinsic to the neural and 
synaptic dynamics? 

Here we explore the effects of chaotic noise on continuous parameter working 
memory. This task is particularly sensitive to noise, yet neurons in cortical areas 
involved in the maintenance of continuous parameter working memory have been shown 
to fire irregularly during task performance 10,11 . Attractor dynamics are often put 


forward as a mechanism for the persistent neural activity underlying this task. 
Continuous attractor networks are dynamically characterized by a continuous manifold 
of semi-stable steady states, which make it possible to memorize parameters with a 
continuous range of values, such as an angle or a position 12 -18 . In such networks. 


noise in neural or synaptic activity can cause diffusion along the manifold of steady 
states, leading to gradual degradation of the stored memory 19 -21 . However, irregular 


activity in the balanced state does not arise from mechanisms intrinsic to neurons or 
synapses, but rather from chaotic dynamics, and its consequences for continuous 
parameter working memory are largely unexplored. For this reason we addressed two 
questions. First, we asked whether a neural network can possess a continuum of 
balanced stable states. Second, we investigated how, in this scenario, chaotic noise 
would affect information maintenance. 


Persistence in balanced networks 

The question of whether balanced networks can produce persistent activity has 
attracted considerable interest in recent years. Several works explored architectures 
which give rise to slow dynamics in balanced networks, characterized by the coexistence 


of multiple discrete balanced states 22 . In several recent works multi-stability resulted 


from the existence of clustered connectivity, and slow transitions were observed between 
the discrete semi-stable states 23 -25 . Other works [8 26 demonstrated that a discrete 
set of semi-stable states can be embedded in a balanced neural network, using a similar 
construction as employed in the classical Hopfield model of associative memory 


27 


A few works have addressed the possibility that balanced neural networks may 
generate slow persistent activity over a continuous manifold. Such dynamics were 
demonstrated in simulations of neural networks that included short-term synaptic 
plasticity 28 , or a derivative-feedback mechanism 29 Previous works have not 


demonstrated the existence of a continuum of steady states in a balanced neural 
network analytically, and it has remained unclear whether such a continuum can be 
obtained without evoking additional mechanisms (such as short-term synaptic plasticity, 
or derivative-feedback). In addition, the influence of the chaotic dynamics on the 
persistence of stored memory has not been analyzed. These questions are addressed in 
the present work. 

Below, we identify an architecture in which slow dynamics are attainable in a simple 
form of a balanced neural network. We prove analytically the existence of a continuous 
attractor in our model in the large population limit. In finite networks, we show that 
the chaotic noise drives diffusive motion along the attractor - leading, among other 
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Fig 1. Network architecture and parameters. A Two neural populations with 
rates ri, r 2 inhibit each other with synaptic efficacies — J. B Two coupled balanced 
subnetworks, each consisting of an excitatory and inhibitory population. Excitatory 
(inhibitory) connections are represented in blue (red). Mutual inhibition is generated by 
all-to-all connections of strength —J^/K/N from each inhibitory population to the 
excitatory population of the other subnetwork. As in [^, connections within each 
subnetwork are random with a connection probability K/N, 1 K N. Connection 
strengths are: Jee/VK, Jie/VK, Jei/VK and Jn/y/K according to the identity of 
the participating neurons. Without loss of generality, we chose Jee = Jie = 1 and 
define Jei = —Je, Jii = —Ji- An excitatory input \/KEo is fed into both excitatory 
populations 


consequences, to slowly decaying spike cross-correlations. We show that the diffusivity 
scales inversely with the system size, as predicted previously for continuous attractor 
networks with intrinsic sources of neuronal stochasticity. With a reasonable number of 
neurons and suitable tuning, our model network exhibits slow dynamics over a 
continuous manifold of semi-stable states, while exhibiting single neural dynamics which 
appear stochastic, as observed in cortical circuits. 


Results 

Reciprocal inhibition between two balanced networks 

Our neural network model is based on the classical balanced network model presented 
in Refs. [^[^[^. This model consists of two distinct populations of binary neurons, one 
inhibitory and the other excitatory. The recurrent connectivity is random and sparse, 
with a probability K/N for a connection, where N is the population size (assumed for 
simplicity to be the same in both populations), K is the average number of connections 
per neuron from each population, and the connection strength is ^ 1/^fK. For 
\ ^ K ^ N and over a wide range of parameters, the mean population activity settles 
to a fixed point (the balanced state) where on average the total excitation received by 
each neuron is approximately canceled by the total inhibition (to leading order in 

and the neural dynamics are driven by the fluctuations in the input. The single 
neuron activity appears noisy, neither of the populations is fully activated or 
deactivated, and the overall network state is chaotic. 

Despite the nonlinearities involved in the dynamics of each neuron, the population 
averaged activities in the balanced state are linear functions of the external input [^[^. 
We exploit this linearity to build a simple system of two balanced networks projecting 
to each other. The intuition comes from a simple model of a continuous attractor neural 
network consisting of linear neurons arranged in two populations that mutually inhibit 
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( 1 ) 


each other, Fig. The linear rate dynamics of this system are given by: 

rr = —r + Wr + E , 

where E = [Eq^Eq], i?o > 0 is an external input and 


W = 


0 -J 
-J 0 


( 2 ) 


For J = 1 the system has a vanishing eigenvalue, and the fixed points form a continuous 
line: ri + r 2 = Eq. 

The simple neural architecture of Fig. was used as a basis for modeling the 
dynamics of neural circuits responsible for memory and decision making in the 
prefrontal cortex 31-35 . In our model, a balanced subnetwork replaces each of the 


populations of Fig. and the inhibitory population in each subnetwork projects to 
the excitatory population of the other subnetwork. Fig. and Methods. Thus, the 
model consists of two reciprocally inhibiting balanced neural populations. 

We consider first a scenario in which the inhibitory connectivity between the two 
sub-networks is all-to-all. Therefore, the network includes a combination of strong, 
random synapses within each sub-network and highly structured, weak synapses 
between the two sub-networks. This scenario lends itself to analytical treatment of finite 
N effects (see below). Later on, we present results also for an alternative scenario, in 
which the connections between the two-subnetworks are sparse, random, and strong 
{Additional randomness in connectivity and inputs). 


Continuum of balanced states 

We first examine whether the two-subnetwork architecture can give rise to a continuum 
of balanced states. The parameters of the network connectivity in our model are 
summarized in Fig. and in Methods. The mutual inhibition between the subnetworks 
is assumed to be all to all, and the interaction strength is scaled such that the total 
inhibitory input to each neuron, coming from the opposing subnetwork scales in 
proportion to \/K. 

Similar to the case of a single balanced network [^, the mean field dynamics of the 
population averaged activities for TV —>■ oo and K ^ 1 are given by: 

niTii = -iJii + H{-Ui/yAFi ), (3) 

where mi{t) = [* = 1 (2) for the excitatory (inhibitory) population of 

the first subnetwork, and similarly i = 3,4 in the second subnetwork], af{t) is the state 
of neuron k in population i at time t, H (x) is the complementary error function, and Ui 
(oi) is the mean (variance) of the input to a neuron in population T, averaged over the 
population and over the random connectivity {Methods). This equation is an 
approximation which becomes exact in the limit K ^ oo. 

To check whether there exist parameters for which the system has a continuum of 
balanced states, it is convenient to write the steady state equations of the above 
dynamics, while making use of the assumption that K is large. In the limit iV —>■ oo 
these equations become linear {Methods): 

mi — JE'm2 — Jmi + Eq = 
mi — J/m2 = 

m3 — J£)m4 — Jm2 + Eq = 
m3 — Jjmi = 

By choosing the interaction strength between the two subnetworks to he J = Je — Ji, 
this system becomes singular, and has a continuum of solutions arranged on a line in 
the mean activities space, which represent a continuum of stable balanced states. 


0 , 

0 , 

0 , 

0 . 


( 4 ) 
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Finite K 


When K is large and finite, but still in the idealized limit of infinite N, the population 
dynamics remain deterministic, and are approximately described by Eq. but the 
nullclines corresponding to the steady state equations (Eq. 131 are now nonlinear. 
Therefore, a precise continuum of steady states cannot be established. However, if the 
nonlinear nullclines are close to each other, slow dynamics are attainable in a specific 
direction of the mean activity space. To make this statement more precise, we first note 
that the steady state equations always have a symmetric solution in which mi = m 3 
and m 2 = m 4 . If in addition, at this symmetric point, the slopes of the nullclines are 
identical {dm^/dmi = — 1 ), there is a vanishing eigenvalue of the linearized mean field 
dynamics (Eq.around this point {Methods). 

Under these conditions, the smallest eigenvalue of the linearized dynamics is 
expected to be small also in the vicinity of the symmetric point. In fact, even for 
moderately large values of K {K = 1000), the two nullclines nearly overlap over a large 
range of mi and m 3 , Fig. and[^. Figure demonstrates that in this case there is 
an eigenvalue close to zero within a wide range of locations along the approximate 
attractor, and therefore the dynamics are slow at any position along this range. Below, 
we denote by A the eigenvalue closest to zero of the linearized dynamics, evaluated at 
the symmetric fixed point. 

As observed in other continuous attractor neural network models, the slow dynamics 


are sensitive to the tuning of the recurrent connectivity 15 29 ^ (see also 
Discussion). This sensitivity is quantified by the dependence of A on the coupling 
strength between the two subnetworks. Figure (top panel) shows how A depends on 
the mutual inhibition strength J and on K: A is linear in J, and proportional to 
For K = 1000, J must be tuned to a precision of ^ 0.1% to achieve a time scale A“^ of 
several seconds, when the intrinsic time scale r is 10 ms. The real part of the next 
eigenvalue is negative, proportional to -s/AT, and is weakly dependent on J (Fig. [^, 
bottom panel). 

We find that the approximate line attractor is stable to small perturbations over a 
wide range of parameters. This is verified by observing that when linearizing the 
population dynamics (Eq. around positions along the approximate attractor, the real 
parts of the four eigenvalues are negative, and one of them is close to zero, reflecting the 
slow dynamics along the approximate attractor - as demonstrated in Fig. [^. 

As an illustration of the existence of a direction in mean activity space, along which 
the dynamics are slow, we numerically solved the mean field differential equations in the 
limit of infinite N and K = 1000, with injected white noise. Figure [^E-F) shows that 
the resulting mean activities trace a line in the mean activities space (E) and the 
dynamics along the line are slow (F). 


Diffusive dynamics in finite size networks 

Next, we consider the realistic situation in which N is finite in the two-subnetwork 
model, while still requiring that N K ^ 1. Instead of adding noise to the dynamics 
of each neuron, we ask whether the chaotic dynamics are sufficient to drive diffusive 
motion along the approximate attractor. This question is motivated by the fact that 
diffusive dynamics are observed in model neural networks of intrinsically noisy neurons, 
with a finite number of neurons [19] . In addition, this question is motivated by evidence 
of diffusive dynamics underlying continuous parameter working tasks - as observed both 
in the behavioral data and in its neural correlates in the prefrontal cortex 


10 II 36 


Since the population dynamics are no longer given by Eq. we performed large 
scale numerical simulations of networks with N ranging between 10"^ to 15 x 10"^ 
(additional details on the simulations can be found in Methods). To simplify the 
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A Projections of the nullclines rhi = 0 
Here 


Fig 2. Dynamics in the TV —>■ cx) limit. 

(blue) and TO 3 = 0 (red) on the mi — m 3 plane, based on Eq. 13 
K = 1000, Je = 4, Ji = 2.5, J = 1.5, te = 10 ms, tj = 8 ms, and Eq = 0.3. The 
same values of Je, Ji, Eq, te, and t/ are used throughout the manuscript. B Same as A, 
except that here J ~ 1.7, tuned to achieve a singular Jacobian at the symmetric point. 
C Real part of the four eigenvalues of the Jacobian evaluated at different points along 
the approximate attractor (here parametrized by the value of mi). Note that there are 
two complex conjugate eigenvalues, and their real parts overlap (red curve). D Top: the 
eigenvalue closest to zero as a function of the mutual inhibition strength. Different 
colors correspond to different values of K: 5000 (black), 1000 (green), 500 (red) and 100 
(blue). All other parameters are as in A. Bottom: real part of the eigenvalue next to 
closest to zero as a function of the mutual inhibition strength. E Integration of Eq. 
with injected uncorrelated Gaussian noise with a = 10“^ v/l0msec (Eqs. IsflO l, J = 1.5 
(black), J ~ 1.7 (blue). F Dynamics of the projection along the special direction 
(parameters and colors as in E). 
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Fig 3. Single neuron statistics. Top: Raster plot of 30 neurons from one of the 
excitatory populations in resting state. Here a spike is defined as a transition from an 
‘off state to an ‘on’ state. Bottom: inter-spike interval distribution of four 
representative neurons from one of the excitatory populations. A fit to an exponential 
function is shown as a solid red line. 


analysis and the presentation, we chose the random weights within each subnetwork 
such that they precisely mirrored each other, which ensured that the fixed point would 
be symmetric (mi = m 3 and m 2 = m 4 ). If, alternatively, the connections in each 
subnetwork are chosen independently, the fixed point deviates slightly from this 
symmetry plane (this deviation approaches zero for infinite networks). However, all the 
results described below remain qualitatively valid (see below. Additional randomness in 
connectivity and inputs). 

The neural activity observed in our simulations is irregular and individual neurons 
approximately exhibit exponential ISI distributions similar to those observed in the two 
population case, although their dynamics are deterministic (Fig. [^. To test whether the 
network can perform short term memory tasks, we initiated the population activities 
such that the network state was close to some point along the approximate line 
attractor. Figure]^ shows the resulting dynamics of the four populations: the 
activities persisted for a few seconds before decaying towards the symmetric fixed point. 
Figure]^ shows the projection along the slow direction, X{t) (defined in Eq. 17), again 
revealing the slow decay of the initial state. Figure]^ shows statistics of trajectories 
that start from two initial positions along the approximate attractor, when J is tuned 
to achieve ~ 9 s. The state of the network enables discrimination between the two 
conditions over a time scale of several seconds. The ability to do so with high confidence 
is influenced both by A and the stochasticity of the motion, which we characterize in the 
following section (see also Discussion). SI Fig shows the mean square displacement 
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(MSD) from the starting point for the same dataset, averaged over all trials. 

Long after initialization, the population activities fluctuate around the symmetric 
fixed point, along a line corresponding to the approximate attractor: a projection on the 
mi — m 3 plane is shown in Fig. |^. Fig. demonstrates that X{t) exhibits slow 
diffusive dynamics. To demonstrate that the dynamics are effectively one dimensional, a 
projection on a perpendicular direction is shown as well. 


A B C 






Fig 4. Dynamics of finite N Networks. A Mean activities of the four populations 
after initialization at a specific state along the approximate attractor. Blue (red) traces 
are used for the excitatory (inhibitory) populations. Dashed and solid lines are used to 
distinguish between the two subnetworks. B Projection along the approximate 
attractor, shown for the same simulation as in A. C Projected dynamics for two initial 
conditions: A(0) = 0.05 (blue shaded area) and A(0) = 0.08 (grey shaded area). Black 
lines are averages over 50 trials. Shaded areas represent the standard deviation 
measured over 50 independent trials. D Projection of the mean activities on the mi-m^ 
plane in resting state activity. E Dynamics of the projection X along the special 
direction (black), and a projection on a perpendicular direction (red) in the same 
simulation as in D. In A,B and C A = 1.5 x 10®, K = 500 and J ~ 1.77. In D and E 
N = 10®, K = 1000 and J ~ 1.69. Other parameters are as in Fig.j^ 


Statistics of the diffusive motion 

From here on we focus on the dynamics of X{t), the projected position along the 
approximate attractor. The dynamics of X can be characterized by the two moments: 


F{X, At) 
G(A, At) 


{X{t + At) - X{t)\ X{t) = X)^ , 


{[X{t + At)-X{t)f Xit) = X)t. 


(5) 

( 6 ) 


The first moment F characterizes the systematic component of drift along the 
approximate attractor, and the second moment G characterizes the random, diffusive 
component of the motion. Both moments may depend, in general, on the position X 
along the approximate attractor. Figures [^A-B) show measurements from simulations 
of F{X, At) and G{X, At) in the limit of small At, at various locations X. 

For small At and near the fixed point, we expect F{X, At) ~ —XXAt with constant 
A, where in the limit of large N and K, A becomes equal to the smallest eigenvalue of 
the deterministic linearized dynamics at the symmetric fixed point. In fact, this relation 
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holds to a very good approxiniation over a wide range of positions along the 
approximate attractor (Fig. HM; 

The moment G{X, At) (Eq. characterizes the diffusion along the approximate line 
attractor, and is the focus of our analysis in the rest of this section, since it quantifies 
the random aspect of the dynamics, driven by the chaotic noise. Figure shows 
measurements of this quantity from simulations (for small At), over a wide range of 
positions along the approximate attractor. Note that for the parameters we use and our 
choice for the parametrization of X (Methods), the range of X is approximately 
[—0.2,0.2]. Figure shows measurements of G(X,At) for A = 0, over a wide range of 
time intervals. 


Short time scales 

Our main interest lies in the diffusive motion over long time scales compared to r. 
However, we consider first the diffusive motion over short time scales, quantified by 
G(X, At) for At < r, since in this case the behavior of G can be expressed exactly in 
terms of the averaged autocorrelation function, qj{At) = (^ + ^'t)^i (t)) 

(Methods). Using the mean field theory, it is possible to derive a differential equation for 
dit) [z], which can be solved numerically. Using the numerical solution, we obtained a 
prediction for G(X, At) which is in excellent agreement with measurements from 
numerical simulations. Fig. [5p. Note that there are no fitting parameters in this 
calculation. 

This analysis leads to two conclusions, which are important for the analysis that 
follows below: first, G is proportional to At for small At. Second, G is inversely 
proportional to N, the size of the neural populations. A similar derivation can be 
applied also to the single balanced network discussed in [^, for At <C r (upper inset in 

Fig. [5)3). 

In addition, we note that G(X, At —>■ 0)/At is approximately constant along the 
approximate attractor, as seen in Fig. [^. Therefore, in most of the numerical results 
below we focus on G near the symmetric fixed point (X = 0). 


Diffusion over arbitrary time scales 

On time scales larger than r, the behavior of the two coupled balanced subnetworks 
differs dramatically from that of the single balanced network: in the single balanced 
network G saturates for At > t (Fig. [^, upper inset), whereas in the two coupled 
balanced subnetworks G continues to increase as a function of At, up to At of order 
(Fig. [^, main plot). Thus, the diffusive motion generates correlated activity over time 
scales much longer than r. Because the chaotic noise itself is uncorrelated on time scales 
longer than r (as shown more precisely below), and since A is approximately constant 
along the approximate attractor, we may expect the motion to approximately follow the 
statistics of an Ornstein-Uhlenbeck (OU) process. This approximation provides a good 
fit to the dynamics. Fig. HP, as expected. This made it possible to extract a diffusion 
coefficient D from the simulations which characterizes the random motion on time scales 
T < At < A“^. Furthermore, since D and A are approximately constant over a wide 
range of positions along the approximate attractor (Figs. §A -B]), the approximation as 
an OU process provides a precise and compact description of the trajectory statistics, 
from which the performance of the network in retention of memory can be deduced (see 
Discussion and Figs.J^, SI Fig, S2D Fig, S3E Eig and S4[B,C] Fig). 

According to Eq. 28 (Methods), fluctuations in the mean activity scale as 1/A, but 
this equation is valid only for time scales smaller than r, whereas the diffusion 
coefficient D characterizes fluctuations on longer time scales. Eigurej^ demonstrates 
that the 1/A scaling holds also for the diffusive motion over long time scales: the 
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Fig 5. Statistics of motion along the approximate attractor. A-B Statistics of 
motion along the attractor over short time scales: mean rate of drift F(A, At —)■ 0)/At 
(A), and random diffusion G{X, At —>■ 0)/At (B), measured in numerical simulations as 
a function of the location X along the approximate attractor. A linear fit is shown in 
red in both panels. Here N = 30000, K = 1000, At ~ 3 ms. C Numerical measurement 
of G{X = 0, At) (Black trace. The shaded blue area designates the standard deviation 
of the mean) and a fit to the statistics of an OU process (red). D Measurement of 
G{X, At 0) (black rings, error bars are smaller than the rings if not shown), compared 
with the analytical expression, Eq. 28 (red trace). Here A = 1.2 x 10®. E Measurements 
of G{X, At) from simulations (Black trace. The shaded blue area designates the 
standard deviation of the mean, same as in C), compared with the semi-analytical 
approximation (red), Eq. 36 {N = 1.2 x 10®). Lower inset: zoom-in on At < r. Upper 
inset: measurement of G{X = 0, At) from a single balanced network. Here we chose 
X = mi(t) — {mi{t))t- F Diffusion coefficient (extracted from a fit to an OU process), 
shown as a function of N. Symbols: simulations, red trace: fit to ^ 1/N dependence. 
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Fig 6. Cross correlations in a single balanced network. Population averaged 
cross-correlations of neural activity in a single balanced network, C""(At) (Eq. [^, 
shown as a function of the time lag At (in units of r = 10 ms). The cross-correlation 
functions are multiplied by N, to demonstrate that Cij scale as 1/iV: with this choice of 
scaling, measurements from simulations with different values of N collapse on one curve. 


diffusion coefficient, extracted from a fit to the statistics of an OU process, is inversely 
proportional to N. The same scaling with N has been observed in continuous attractor 


networks with intrinsic neural stochasticity 19 . Another important implication of the 


1/N scaling is that sufficiently large networks can reliably store a continuous variable in 
short-term memory (see Discussion). 

To understand this result in more detail, we start by considering the time dependent 
correlation functions of mi in a single balanced network: 


1 


N 


N2 

k,l=l 


(7) 


where i,j € {Ex,Inh}. An analytical expression for these correlation functions is not 
available (see for further discussion). Therefore, we measured them numerically in 
simulations of activity in the single balanced network architecture. These measurements 
indicate that the time-dependent correlation functions decay over time scales of order t, 
and that they scale as 1/N, Eig.|^ 

Next, we show that the statistics of diffusion in the coupled system can be expressed 
precisely in terms of the correlation functions of the single, uncoupled balanced 
networks (Methods). Thus, the correlation structure of the chaotic noise in the single 
balanced network determines the statistics of the slow diffusive motion along the 
approximate attractor in the coupled two-population network. 

Using the noise cross correlations measured in simulations of a single balanced 
network, it is possible to obtain a semi analytical approximation for G in the system of 
two coupled subnetworks, which does not involve any fitting parameters. The 
measurements of G from simulations are in excellent agreement with this analytical 
prediction, Eig.l^. The above analysis indicates that the 1/N scaling of the diffusion 
coefficient (Fig.j^) is a consequence of the decay with N of cross correlations in 
activity of different neurons in the single balanced network. In this sense, for large N 
the network behaves as a collection of neurons with independent random noise, although 
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Fig 7. Cross correlations in the conpled subnetwork architectnre. A 

Auto-covariance of the mean activity of one of the excitatory populations (black). Cross 
covariance between the first excitatory population and the first inhibitory population 
(green), the second inhibitory population (red) and the second excitatory population 
(blue). B-D Pairwise correlation of neuron activity in one of the excitatory populations. 
Correlations are measured in n neurons, selected randomly, and averaged over all pairs 
(Methods), where n = 10 (B), 50 (C), and 100 (D). The simulated time is 15 minutes. 
An average over the entire population is shown in grey. In all panels TV = 1.2 x 10®, 

K = 1000, A-i ~ 2 sec. 

the source of this apparent noise is the chaotic activity generated by the recurrent 
connectivity. 

Spike correlation functions 

The diffusion along the approximate attractor implies that the population activities are 
correlated over long time scales, up to order A“^: Fig. shows examples of the 
population correlation functions C’”, which differ dramatically from those of the single 
balanced network. Fig. (note the different time scales in the two sets of figures). 

Spike trains generated by single neuron pairs are correlated over long time scales as 
well, since all neurons in the network are coupled to the collective diffusion along the 
approximate attractor. However, a reliable observation of the slowly decaying 
correlation in a single pair might require an unrealistically long recording time. This 
difficulty can be overcome potentially by considering the simultaneous activity of 
multiple neurons: for example, we find in our simulations of a network with N = 10® 
that for 15 minutes of simulated time, a simultaneous recording from ~50 or more 
neurons from each population would be sufficient to reliably observe the slow temporal 
decay of the correlations. Fig. C, whereas a simultaneous recording from ten neurons 
over 15 minutes may be insufficient. As demonstrated in Fig. (B-D) the noise falls as 
one over the number of measured neurons and as one over the total recording time. 
Hence, by extrapolating from the results in Fig. 06 -D), ^12 hours of recording would 
be required to obtain a measurable correlation signal from a single pair of neurons. 
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Fig 8. Chaotic nature of the noise driving the diffusive motion. A Projection 
X of the mean activities on the approximate attractor in 30 trials with the same update 
schedule and the same initial conditions, except for one neuron which was flipped in 
each population {N = 10^). B Variance over 1500 trials as a function of time, with 2cr 
error bars (gray). Red: fit to the variance of an OU process {D ~ 3.4 • 10“®l/10s ). 


Chaotic behavior 

Next, we briefly address the chaotic nature of the noise that drives diffusive motion. 
Figure |8]4 shows results from multiple simulations in which the initial network state 
differed solely by a flip of one neuron in each population (out of ^ 10® neurons). All 
other parameters, including the asynchronous update schedule and the network weights 
were identical across runs. The time dependence of the variance across different runs is 
similar to the variance over realizations of an OU process. Fig. |^, with a similar 
diffusion coefficient as observed in the fit for G{X,At), Fig. [^C,E). Thus, the different 
initial conditions are equivalent to different realizations of dynamic noise that drives 
diffusive motion along the approximate line attractor. 

Additional randomness in connectivity and inputs 

In addition to the results described above, we investigate several scenarios in which we 
introduce additional randomness, either frozen or dynamic. First, we relax the 
assumption that connections in the two sub-networks precisely mirror each other. This 
assumption was made above for convenience: the precise identity of the synaptic 
connections simplifies the numerical analysis since it ensures a precise symmetry of the 
dynamics around the hyperplane A = 0. S2 Fig demonstrates that the main conclusions 
of our analysis remain valid when the connectivity in each sub-network is drawn 
independently: dynamics are slow along the approximate line attractor, and the 
diffusion coefficient along the line scales as 1 /N with a prefactor which is somewhat 
larger than the value observed in Fig. 

The main new feature that arises when the synaptic connections are drawn 
independently in the two sub-networks, is that the relaxation point of the dynamics 
along the approximate attractor deviates from the hyperplane X = 0. The 
characteristic magnitude of this deviation decays monotonically to zero with increase of 
the system size N. We note that the mean field description of the dynamics for finite 
K ^ 1 and in the limit iV —>■ oo, is identical to the mean field dynamics associated with 
the perfectly symmetric scenario. 

Next, we consider a scenario in which the inhibitory connections between the two 
sub-networks are random, and follow the same basic architecture as the connections 
within each sub-network. Therefore, instead of assuming weak all-to-all connections of 
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order y/K/N^ we include random connections of order with a probability K/N 

for a synaptic connection. In addition, we relax the assumption of mirrored connections 
in the two sub-networks. In this case it is straightforward to show that the mean-field 
equations remain identical to those associated with the case of all-to-all connections, in 
the limit N —)■ oo, K —)■ oo. Therefore, a continuous set of balanced states can be 
achieved {Methods). 

When K is large and finite, the mean-field equations are slightly different in the two 
scenarios (Methods). The main outcome of this difference is a shift of the unstable fixed 
points of the dynamics from the planes m-i = m 2 = 0 and m 3 = rrii = 0. Consequently, 
there is a certain degree of activity in both sub-networks even in the unstable fixed 
points of the dynamics. This is shown in S3 Fig. More significantly, S3 Fig 
demonstrates that in the case of random and sparse connections between the two 
sub-networks, the dynamics exhibit the same characteristics as in the case of all-to-all 
connectivity, and can be accurately approximated as an OU process over time scales 
longer than r. The coefficient of diffusion D scales linearly with 1/A^, with a prefactor 
which is close to the one observed in S2 Fig. 

Finally, we explore the effects of stochasticity in the input Eq to the network (see 
Figj^). S4 Fig demonstrates that even when the inputs include a large degree of 
temporal variability, and the noise injected to all the neurons in the network is highly 
correlated, the network exhibits slow dynamics along an approximate attractor, with 
statistics that are qualitatively similar to what we present above. 


Discussion 


In summary, we demonstrated that a simple balanced network can exhibit slow 
dynamics along a continuous line in the population mean activity space. In finite 
networks, the chaotic dynamics drive diffusive motion along the approximate line 
attractor. We calculated the diffusivity in the system, based on the correlation structure 
observed in a single balanced network, and showed that the diffusion coefficient along 
the approximate attractor is inversely proportional to the network size. This is similar 
to the effect of noise that arises from intrinsic neural or synaptic mechanisms 


19 


The slow diffusive motion along a one-dimensional trajectory induces correlations 
within the populations, and in single neuron pairs, that persist up to a long time scale 
set by the decay time A“^. This property characterizes the dynamics of the system even 
when it is at the resting state {X ~ 0), i.e., when the network is not engaged in a 
memory task. Hence, this observation generates a prediction for spontaneous activity in 
brain areas such as the prefrontal cortex, in which continuous attractor dynamics, based 
on mutual inhibition between two populations, have been postulated 


33 


A slowly decaying cross-correlation function characterizes the spikes produced by 
any pair of neurons in the network, but due to the high irregularity in the single neuron 
activity it may be necessary to average over multiple simultaneously recorded neuron 
pairs in order to obtain a clear measurement over a realistic time scale for a single 
experiment (in Fig. [!P> 50 neuron pairs and a ^15 minute measurement). Furthermore, 
it is important to label the neurons based on their functional properties: averaged over 
all populations, the cross correlations seen in Fig. [34 cancel. In brain areas involved in 
short term memory tasks, this labeling can potentially be achieved by first measuring 
the tuning curves of neurons as a function of the stored variable. 


Linear vs. nonlinear neural response in decision making circuits 

Several models of decision making circuits in the prefrontal cortex were based on the 
simple neural architecture of Fig. UK [32|. This network architecture can precisely 
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generate a continuous attractor if the activity of single units is a linear function of their 
input. While the linear dynamics provide a simple intuition for the principles 
underlying continuous attractor dynamics in recurrent neural networks 31 , it is more 


difficult to obtain a continuum of steady states using the above network architecture 
when single neural responses are nonlinear. Therefore, specifically tuned forms of 
nonlinearity 33 35 , or more elaborate network architectures - still based on mutual 
inhibition between two or more neural populations have been proposed 33 34 . In this 


context the linear input-output relationship, characterizing the single balanced network 
of Ref. [^ , is a useful computational feature that facilitates the construction of a 
continuous attractor network based on the simple architecture of Fig. [T^. However, the 
main motivation for considering the balanced state in this work lies in its ability to 
account for the irregular spiking of single nenrons in cortical circuits. 


Time scales of persistence and retention of information 

Continuous attractor networks are an important model for maintenance of short-term 
memory in the brain. The memory is represented by the position along the attractor, 
and therefore the stochastic motion along the attractor determines the fidelity of 
memory retention. Since the dynamics of our proposed network are well characterized 
as an OU process over time scales longer than r and over a large range of positions 
along the approximate attractor, it is straightforward to assess how the position along 
the approximate attractor evolves in time. All aspects of the trajectory can be easily 
inferred based on the initial state along the approximate attractor, the time interval, and 
the two parameters which characterize the OU process: A and D. Similar considerations 
can be applied also for noisy continuous attractors in which the stochasticity arises from 
mechanisms other than the chaotic dynamics studied here 19 20 . We next discuss how 


the decay time and the diffusivity depend on the parameters of our model. 

The decay time can be calculated exactly in the limit of iV —oo, AT 3> 1 
(Fig.[2f. It is interesting to note that there are competing influences of K on the tuning 
of the attractor: with increase of K, A“^ becomes more sensitive to J. However, when 
K is reduced, the nullclines (Fig. become less linear, causing deviations from the 
ideal behavior far from the symmetry point. We showed that for K ^ 10^ and 
T = 10 ms it is possible to achieve persistence over several seconds, while the decay time 
and diffusion coefficient are approximately constant along a wide range of positions. 
This requires to tune J to a relative precision of order 0.1%. 

Since all times scale linearly with the intrinsic time constant, longer persistence 
times (or a weaker tuning requirement) can be achieved if the intrinsic time constants of 
individual units is longer that the value of 10 ms assumed in our examples. Intrinsic 
neural persistence or slow synapses could potentially contribute to this goal under more 
realistic biophysical descriptions of the neural dynamics. 

Finally, we note that the requirement for precise tuning of the connectivity is a 
characteristic featnre of all continuous attractor models. Several works have proposed 
ways to achieve tuning through plasticity mechanisms 37 38 , or ways to stabilize the 


dynamics by additional mechanisms such as synaptic adaptation 28 39 40 or negative 
derivative feedback 


29 30 , in order to increase the persistence time. 


Diffusive motion The diffusive motion along the approximate attractor, which is 
the main focns of this work, poses an additional limitation on the persistence of short 
term memory. While appropriate readout mechanisms may be able to take into account 
the systematic drift caused by the decay towards to the symmetry point, random 
diffusion inherently degrades the information stored in the position along the attractor. 

With 10^ neurons per population, random diffusion over an interval of one second 
causes a deflection in X with a standard deviation of ~ 10“^. This quantity should be 
compared with the possible range of X, which is approximately [-0.2,0.2] in our 
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parametrization of the position along the approximate attractor (we verified that the 
dynamics are accurately approximated as an OU process over the range [-0.1,0.!]). 
Therefore, with the tuning chosen in Fig. |^, where ^ 10 s, and with N = 10^, the 
limiting factor for discrimination between nearby stimuli after a delay period of order 1 s 
is the diffusive dynamics along the approximate attractor. 

Random diffusion in continuous attractor networks of Poisson neurons is also often 
very significant 19 20 . The diffusivity can be suppressed by increasing the number of 


neurons, increasing the intrinsic time constant of individual neurons and synapses, or by 
assuming that the firing of individual neurons is sub-Poisson 19 41 . Our proposed 


model for a line of persistent balanced states similarly predicts a significant degree of 
random diffusion, highlighting the need to better understand how noise influences the 
retention of continuous parameter memory in cortical circuits. 

There are several ways in which the random, diffusive component of the motion can 
potentially be reduced: First, by increasing the number of neurons. We presented 
results for networks containing (altogether) up to 6 x 10^ neurons, and it is 
straightforward to extrapolate our estimates for D to larger networks based on the 1 /N 
dependence of the diffusion coefficient. Second, the diffusion coefficient is expected to 
decrease significantly if slow synapses participate in the dynamics 19 , or if the intrinsic 
time constant of the neurons is increased. Third, additional mechanisms such as 
synaptic adaptation 28,40 or derivative feedback 29 may perhaps contribute to a 
reduction in the diffusivity. Finally, an intriguing possibility is that highly structured 
and tuned connectivity can yield improved robustness to noise in a balanced state, as 


hinted by recent results on predictive coding in spiking neural networks 42 


Methods 

Model 

In our model, two balanced neural subnetworks inhibit each other reciprocally: the 
inhibitory population in each subnetwork projects to the excitatory population of the 
other subnetwork. Fig. 

As in Refs. [^[^, the neurons are binary and are updated asynchronously, at update 
times that follow Poisson statistics. The mean time interval between updates is te (tj) 
for neurons in the excitatory (inhibitory) populations. In each update of a neuron k 
from population i, the new state of the neuron is determined based on the total 
weighted input to the neuron, 

= ©(«?), ( 8 ) 

where 0 is the Heaviside step function, and is the total input to the unit at that time, 




Ni 

E 

i=i 




-Tk. 


(9) 


Here, Tk is the threshold and Eq is an external input. We chose the external input to be 
zero for the inhibitory populations and to be positive (and constant) for the excitatory 
populations. Connections within each network are random with a connection 
probability K/N, where 1 <C AT <C A^. Here N is the population size (chosen to be 
equal in all populations for simplicity) and K is the average number of inputs a neuron 
gets from each population. Connection strengths are: Jee/VK, Jie/VK, Jei/VK 
and Jii/'/K according to the identity of the participating neurons. Without loss of 
generality, we have chosen Jee = Jie = 1 and defined Jei = —Je, Jii = —Ji- Mutual 
inhibition is generated either by weak all-to-all connections (in all figures except for 
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S3 Fig), or by strong random and sparse connections (S3 Fig). In the former scenario, 
synapses of strength —J^/K/N connect each inhibitory neuron to all excitatory neurons 
in the excitatory population of the other subnetwork. In the latter scenario (S3 Fig), 
connections from each inhibitory population to the excitatory population of the other 
subnetwork are chosen randomly with a connection probability K/N, and with strength 
—J/'/K. An excitatory feed-forward input '/KEq is fed into both excitatory 
populations. We denote by Ui the mean of u!^ over all the neurons 
population i and over realizations of the connectivity: 

ui = VK{mi — JEm2 — Jmi + Eq) — Ti , 

U 2 = '/K{mi — Jim 2 ) — T 2 , 

U 3 = \/K{m 3 - JETUi - Jm 2 + Eq) - T 3 , 

U4 = ^/~K{m3 — Jjmi) — Ti . 


k within the 


( 10 ) 


Here are the population averaged activities. The variance of over all the neurons 
k within the population i and over realizations of the connectivity is given (to leading 
order in K/N) by: 

ai = mi + J]^m2 , 
a2 = mi + J]m2 , 
as = m3 + J%mi , 
ai = m 3 + J]mi . 

These expressions are obtained in similarity to the derivation of the variances in Ref. [^. 
Note that the all-to-all inhibitory connections between the subnetworks contribute only 
terms of higher order in K/N. In the scenario where the connections between 
subnetworks are randomly drawn (S3 Fig), the variance of the input to the excitatory 
neurons includes an additional term, due to the variability of inhibitory synapses from 
the opposing sub-network. In this scenario 


ai = mi + J%m2 + J^mi , 
0i2 = mi + Jfm2 , 

0-3 = ^3 + j/^mi + , 

ai = m3 -I- Jfmi . 


( 12 ) 


The mean field equations written below are valid both for all-to-all and for random 
connections between sub-networks, with the appropriate choice of a^. 


Line of balanced states in the limit N ^ K ^ 1 

To check whether there exist parameters for which the system has a continuum of 
balanced states, it is convenient to write the steady state of equation (|^ as follows: 

mi — JEm2 — Jmi -f Eq 
mi — Jjm2 
m3 — Jsmi — Jm2 + Ed 
m 3 — Jjmi 

Taking the limit AT —)■ 00 while requiring that none of the populations is fully on or off 
produces a linear system of equations for the mean activities, Eq. When J = Je — Ji 
the system of linear equations is singular. In this case the steady state equations admit 
a continuum of solutions which comprise a continuum of stable balanced states. A 


= ^[Tl-^H-\mi)] , 

= ^[T‘2-V^H-\m2)] , 
= ^[T3-^H-Hm3)] , 

= ^[Ti-^iH-^{mi)] . 
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possible parametrization of the line of balanced states is given by: 


mi = X , 

TO2 = X/Jl , 

TO3 = -X + JiEo/{Je - Ji), 

1 X 14 = -x/Ji + Eo/{Je-Ji)- (14) 


The conditions Je — Ji > 0, Jj > 1, and 0 < JiEo/{Je — Ji) < 1 ensure that for 
0 < X < JiEo/{Je — Ji) the mean activities are positive and none of them is equal to 0 
or 1. 

In Fig. gE -F), we artificially add to the dynamics (Eq. |3| white Gaussian noise as 
follows: 

niTii =-rrii + H{ui/y/(Xi) + (15) 


where (^i(t)) = 0 and 

(C*(t)0(t')) = cr'^SijS{t - t'). (16) 

Note that this was done only in Figs. [^E-F), to illustrate the existence of slow 
dynamics along a line in the case of infinite N. There is no injected noise elsewhere, 
and in particular there is no injected noise in our simulations of finite N networks. 


Simulations and statistics of the diffusive dynamics 


Our results for networks with finite N are based on large scale numerical simulations. 
In each simulation the connections were chosen randomly as described in the text, and 
an asynchronous update schedule was generated by a Poisson process. Parameter values 
are specified in the legend of Fig. [^in the main text. Averaged population activities 
were calculated online. The projection along the approximate attractor was defined at 
each time point as 

X{t) = vl ■ [mit) - mo] , (17) 


where m{t) is the measured 4 dimensional averaged population activity, mg is the 
vector of mean population activities at the symmetric fixed point, and Vq is the left 
eigenvector of the linearized dynamics with an eigenvalue close to zero. We chose the 
following normalization for the corresponding right eigenvector (see eq. 14): 


1/Ji 

-1 

V -IM / 


(18) 


and the normalization of Vq was chosen such that the dot product of the left eigenvector 
and the right eigenvector equals unity. 

Measurements of G(X, At) (Eq. were done in the following way: for each value of 
X we found all the time points for which \X{ti) — A| < 5, using a small 6 ~ 10“^. 
Then, for each such ti we calculated [X{ti + At) — X{ti)] , and averaged all these 
values to get G{X,At). Subsequently, we averaged over multiple simulations with 
different quenched noise and update schedules. In the manuscript we present results for 
G(0, At), but in similarity to F{X, At)/X, G{X, At) was fairly uniform along the 
approximate attractor. A similar calculation was performed to measure the drift 
F{X,At). In Eigs.[^C-F), results are based on (1 — 2) x 10^ simulations with random 
initial conditions, each spanning a simulated time of about 10 seconds. In simulations of 
the finite N network, we estimated A from measurements of F{X At) near the 
symmetric fixed point, and tuned J to obtain r. In Fig.^ A“^ ~ 2 seconds. 
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Measurement of cross covariance functions 

The cross covariance functions shown in Fig. were calculated in the following way: 
first, we measured the activity of n neurons at M equally spaced time points, with a 
time difference At = 66 ms. Then, for each pair of neurons i,j of populations k,l 
respectively, we calculated the unbiased estimate of the cross covariance: 




M —\m\ — l 


M — Ito 


a=0 


M-1 


M S 


a=0 


M-1 




a=0 


Here = oAt. Now, we averaged over all the measured pairs: 

1 




0.5n(n — 1) 




(19) 

( 20 ) 




For the calculation of the entire population averaged cross covariance we used the 
measured mean activities: 


M—|m| —1 




M - 


a—O 


(^a+m)^/ (^a) 


M-1 


M E 


a—0 


M-1 


M E 


a=0 


( 21 ) 

where mi{t) = i/NJ2i=i Note that the sum in Eq. 21 includes the 

auto-covariances, while the expression in Eq. [^does not. However, the contribution of 
the auto-covariances is negligible in the entire population average, since its contribution, 
relative to the contribution of cross-covariances scales as \/N. 


Diffusion over short time scales 

To analytically evaluate G(A, At) (Eq. over short time scales, we start by writing the 
change in the state of the fc-th neuron in population t in a short time interval At as: 


af (t + At) - af (t) = cf (t) [0?(t) - af (t)] , 


( 22 ) 


where 0f^(t) is the outcome of an update if it occurs, and c^{t) is a random variable 
equal to 1 if the t-th neuron was updated between t and t -I- At and to 0 otherwise. The 
updates occur each r ms on average, so that 


(cfw), = <»‘(()T = ^ 

‘ k 


whereas for i ^ j and/or k I, 


(c/(t)c'(t)) = 


(M: 

TkTl 


(23) 


(24) 


Now the mean squared displacement, G(X, At), can be written as: 

1 

—1 k,l — l 

(25) 

where u/ is the i’th component of the left eigenvector of the Jacobian with eigenvalue 
close to zero. From Eqs. 23p4 we see that for At <C t the contribution of elements with 


- 'i iV 

([^(t+At) - A(t)]^^ = ^ E E K-(^+> 
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i = j, k = I dominates the sum. To leading order in At we have 


Defining 


we obtain: 




X 


1 


4 N 




(|[crf(t + At)-crf(t)]^^ . 


ft (At) 


1 ^ 


G(A, At) 



(26) 


(27) 


(28) 


Diffusion over arbitrary time scales 

To derive an expression for the diffusive dynamics over arbitrary time scales, we start by 
representing the stochastic linearized dynamics of a single balanced network, near the 
symmetric fixed point, as a two dimensional stochastic process: 

5m = Bi 8 m + B 26 E + ^ ^ (29) 


where 5m is the deviation of the mean activities from the fixed point and 5E is the 
deviation of the input from the constant input Eq. Here Hi is a 2 x 2 matrix 
representing the response to perturbations in m, and H 2 is a 2 x 2 matrix representing 
the response to perturbations in the feedforward input. Both are obtained analytically 
from a linearization of the mean field dynamics. Finally, ^ is a random process with 
vanishing mean, whose covariance functions C^{At) are stationary and are yet 
unspecified: 

= . (30) 


Using Eq. 29 it is straightforward to relate C^{t) to the covariance of the activities 


(while assuming constant feedforward input, SE = 0): 


Cdt) = -^G-(t) + ^[G“(t)Hf-HiG-(t)] 
+ HiG™(t)Hf. 


(31) 


Using the measurements of G™ from simulations, we can thus obtain G^ numerically, 
using the above equation. In similarity to G™, G^ decays to zero over a time scale of 
order r. Altogether, Eq. describes the stochastic dynamics of a single balanced 
network close to the balanced state, in response to small fluctuations 5E in the 
feedforward inputs. In the two-subnetwork architecture, each subnetwork is coupled 
only to the mean activity of the other subnetwork, because of the all-to-all connectivity. 
More specifically, the mean activity of each subnetwork linearly modulates the external 
input to the excitatory population of the other subnetwork. Therefore, we can 
approximate the state of the 4-population network as a stochastic process with the 
following dynamics: 


5m = A5m -f ^, 


(32) 


where 5m is now a 4 dimensional vector, whose first (last) two entries represent the 
state of the first (second) subnetwork, and A is the Jacobian of the full 4 dimensional 
dynamics around the fixed point (Eq. 38), related in a simple manner also to the 
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matrices B 12 defined above. The correlation matrix of the 4 dimensional noise vector ^ 
is given by 




Q 0 
0 


(33) 


where is the 2x2 noise correlation matrix (30) evalulated for a single balanced 
network receiving fixed excitatory input, equal to the mean input to each subnetwork at 
the symmetric fixed point. Finally, we use this description of the dynamics to predict 
the statistics of diffusion along the line. We multiply equation from the left by Vq, 
the eigenvector with the eigenvalue close to zero, which we denote by A (note below that 
A < 0): 


X = AX 




(34) 


Here, X = ■ 6m. Thus, we obtain using the Wiener - Khintchine theorem the time 

dependent correlation function of X, 


1 / 

= e^^*^VQC^{t-t')vodt'. (35) 

J — 00 

Finally, the diffusion over an arbitrary time interval At is given by: 

([X(t + At) - X(t)]') = 2 [C'x(O) - C;c(At)] . (36) 


Proof: drrii/dm^ = — 1 F-)- vanishing eigenvalue 


Here we show that when dmi/dm^ = — 1 at the symmetric point (mi = m 3 , m 2 = mi), 
the Jacobian matrix has a vanishing eigenvalue, leading to slow dynamics near the fixed 
point. We denote: 


In terms of these quantities, the 


fi,j — 


dH {-u,/^i) 


dm A 


Jacobian matrix can be written as 


(37) 


A = 


//i.i - 1 
/ 2 . 1 /T 
0 

V 0 


/ 1.2 0 

(/2,2 - 1)/t 0 

/a ,2 /a ,3 — 1 

0 / 4 , 3 /T 


/l,4 \ 

0 

/a ,4 

(/4,4 - 1)/t/ 


(38) 


At the symmetric fixed point /i,i = /a, 3 , / 2,2 = / 4 . 4 , /i ,2 = /a, 4 , /i .4 = /a, 2 , and 
/ 2.1 = / 4 , 3 - The Jacobian’s eigenvalues at that point are then: 


\± _ 1 
— 2 

\fl,l - 1) + 


\j ((/l.l 1) ^ ) 1- ..fl,2/2,1 

± y/l. 4 / 2.1 


(39) 


Next, we approximate the derivative dmi/dm^ at the symmetric point. We use a 
first order Taylor expansion of the mean field equations to get: 


Smi 

= fi.iSmi + fi, 2 Sm 2 + fi,ASmA , 


Sm2 

= f2,iSmi + /2,2<5m2 , 


Sms 

= fi,ASm2 + fi.iSms + fi,2SmA , 


Sm4 

= f2.2SmA + f2,iSms ■ 

(40) 


Here, Smi are the small deviations from the symmetric fixed point. Using these 
equations we can write Sm 2 and Jmi as functions of Smi and Sm^: 

6m2 = 6mi ; Jmi = Sms ■ (41) 

I — /2,2 t — 72,2 
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Plugging these expressions into the equation for 6mi yields an expression for 6mi as a 
function of Sm^. The derivative is: 


dm I _ _ /i,4/2,1 _ 

dm3 /l,l/2,2 — /l,2/2,1 + 1 — (/l,l + /2,2) 


(42) 


Note that dmi/dms = dm 2 /dm 4 _. Equating this derivative to —1 (noting that in this 
case, dmi/dm 3 = dm 3 /dmi) yields: 


/l ,4 = 


/l,2/2,1 — 1 + /l,l + /2,2 — fl,lh,2 

/2,1 


By inserting /i ,4 into 39 we get A_ = 0. 


(43) 
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Fig SI. Mean squared displacement of location along the approximate 
attractor. (Same dataset as in Fig.4C in the main text.) A-B The mean squared 
displacement (MSD) of the location along the line for initial location A(0) = 0.05 (A) 
and A(0) = 0.08 (B) as a function of time. Error bars represent the standard deviation 
of the mean (black). Red: fit to an OU process. OU parameters from fit: 

D = 1.85 X 10“® (10 ms)“^ , A = 10“^ (10 ms)“^ in both panels. 
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Fig S2. Non-mirrored connectivity. Results for a network in which the internal 
connectivity in each sub-network is drawn independently. A Population averaged 
activity projected onto the mi — m^ plane. B Mean activities of the four populations: 
blue for one sub-network and red for the other. The higher activities are those of the 
excitatory populations {mi and m 3 ). C Projection along the approximate attractor. D 
G{X = 0, At) vs. At as measured from simulations (black). Error bars: standard 
deviation of the mean (blue). Red: fit to an OU process. Here iV = 1.5 x 10^ (compare 
with Fig. 5C in the main text). E Diffusion coefficient as a function of N, with fit to 
oc 1/N dependence in red (compare with Fig. 5D in the main text). F Absolute 
distance of the activity from the symmetry plane A = 0 , averaged over time and over 
connectivity instances, plotted vs. N. Red error bars represent the standard deviation of 
the mean. In all panels K = 1000 and iV = 1.5 x 10®. Other parameters are as in Fig. 2. 
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Fig S3. Network with random and sparse inhibitory connections between 
the sub-networks. A Projections of the nullclines rhi = 0 (blue) and = 0 (red) on 


the mi — m 3 plane, based on Eqs. 13 and 12 Here K = 1000, J = 1.8. Insets show a 


schematic illustration of the nullclines near the fixed points, in which the angle between 
the lines is amplified for clarity. B Population activities projected onto the mi — m 3 
plane. C Mean activities of the four populations: blue for one subnetwork and red for 
the other. The higher activities are those of the excitatory populations (mi and m 3 ). D 
Projection along the approximate attractor. E G{X = 0, At) vs. At as measured from 
simulations (black, with std of the mean errorbars in blue) and a fit to an OU process 
(red). (Compare with Fig. 5C in the main text.) F Diffusion coefficient as a function of 
N. Red: fit to oc 1/iV dependence (compare with Fig. 5D in the main text). Here 
K — 1000, A = 1.5 X 10®, J « 1.76, and all other parameters are as in Fig. 2. 
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Fig S4. Effects of correlated input noise. Results from simulations, for a 
network in which dynamical noise is added to the input Eq. The noise has a correlation 
time of 3t, which ensures that the correlation across neurons is not averaged out due to 
the asynchronous updating. The noise is described by an OU process: 

Tnoise^ = + o'noiseVit)^ where Tnoise = 30 ms, and ri(t) is a gaussian white noise, and 

the value of Unoise was varied to control the noise amplitude. A Mean activities of the 
four populations for cr = Eq/S: blue for one subnetwork and red for the other. The 
higher activities are those of the excitatory populations (mi and m 3 ). B G{X = 0, At) 
for a noise = 0 (black), a noise = Eo/30 (green) and a noise = Eq/S (blue). Error bars 
represent the standard deviation of the mean. Red: fits to an OU process. C Mean 
square displacement (MSB) of the location along the line for initial location A'(O) = 0.05. 
Colors are the same as in B. In this figure K = 500, N = 1.5 x 10®, J ~ 1.77. 
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